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Abstract 

The influence of time-dependent fitnesses on the infinite population dynamics 
of simple genetic algorithms (without crossover) is analyzed. Based on general 
arguments, a schematic phase diagram is constructed that allows one to charac- 
terize the asymptotic states in dependence on the mutation rate and the time 
scale of changes. Furthermore, the notion of regular changes is raised for which 
the population can be shown to converge towards a generalized quasispecies. 
Based on this, error thresholds and an optimal mutation rate are approximately 
calculated for a generational genetic algorithm with a moving needle-in-the- 
haystack landscape. The so found phase diagram is fully consistent with our 
general considerations. 

Genetic algorithms (GAs) as special instances of evolutionary algorithms have been 
established during the last three decades as optimization procedures, but mostly for 
static problems (see |Q for an overview and for an in-depth presentation of the field). 
In view of real-world applications, such as routing in data-nets, scheduling, robotics 
etc., which include essentially dynamic optimization problems, there are two alterna- 
tive optimization strategies. On the one hand, one can take snapshots of the system 
and search "offline" for the optimal solutions of the static situation represented by each 
of these snapshots. In this approach, the algorithm is restarted for every snapshot and 
solves the new problem from scratch. On the other hand, the optimization algorithm 
might reevaluate the real, current situation in order to reuse information gained in the 
past. In this case, the algorithm works "online". As can be argued from the analo- 
gies to natural evolution, evolutionary algorithms seem to be promising candidates for 
"online" optimization [|l], The reevaluation of the situation or environment then 
introduces a time- dependency of the fflness landscape. This time-dependency occurs 
as external to the algorithm's population and does not emerge from coevolutive inter- 
actions. Coevolutive interactions as an alternative source of time dependency in the 
fitness landscape are not within the scope of this work. 



1 



In the last years, many different methods and extensions of standard evolutionary 
algorithms for the case of time-dependent fitnesses have been analyzed on the basis of 
experiments (see for a review) but only seldom on the basis of theoretical arguments 
(see 0, 1^). To take a step into the direction of a better theoretical understanding of 
"online" evolutionary algorithms, we will study the effects of simple time dependencies 
of the fitness landscape on the dynamics of GAs (without crossover), or more generally 
saying, of populations under mutation and probabilistic selection. As we will see, it is 
possible to characterize the asymptotic states of such a system for a particular class 
of dynamic fitness landscapes that is introduced below. The asymptotic state forms 
the basis on which it can be decided whether the population is able to adapt to, or 
track, the changes in the fitness landscape. Our mathematical formalism applies to 
GAs as well as to biological self-replicating systems, since the analyzed GA model and 
Eigen's quasispecies model [|, ^ § in the molecular evolution theory (see for a 
recent review) are very similar. Hence, all introduced concepts for GAs are valid and 
relevant in analogous form for molecular evolutionary systems. 

In the following section, we will introduce the model to be analyzed and show the 
correspondence to the quasispecies model. Then, we will introduce the mathematical 
framework, based on which we will formally characterize the asymptotic state as fixed 
point. After presenting the main concepts, we will proceed with the construction of a 
phase diagram that allows to characterize the order found in the asymptotic state for 
different parameter settings. Finally, a moving needle- in-the-haystack (NiH) landscape 
is analyzed and its phase diagram, including the optimal mutation rate, is calculated. 



1 Mathematical Framework 

In order to study the influence of a time- dependent fitness landscape on the dynamics of 
a genetic algorithm (GA), we consider GAs to be discrete dynamical systems. A detailed 
introduction to the resulting dynamical systems model is given by Rowe |T^ (in this 
book). Here, we will only shortly introduce the basic concepts and the notations we 
use within the present work. 

The GA is represented as a generation operator g["^^ acting on the space A^ of 
all populations of size m for some given encoding of the population members. If we 
choose the members i to be encoded as bit-strings of length /, this state space is given 
by 

Am = {('^o, • • • ,n2i^i)/m \Y.i'^i = ^^^i ^ INo}, 

where Ui denotes the number of bit-strings in the population, which are equal to the 
binary representation of i G {0, . . . , 2' — 1}. 

The generation operator maps the present population onto the next generation, 

x(t + l) = GS™)[x(t)]. 

This is achieved by applying a sampling procedure that draws the members of the next 
generation's population x(i:-|-l) according to their expected concentrations (x(t + l)) G 



Aoo which are defined by the mixing [10, O and the selection scheme. For an infinite 



population size, the sampling acts hke the identity resulting in 

Gf)x(^)=x(t + l) = (x(t + l)). 
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Hence, Gt ■= gI°°^ represents in fact the mixing and selection scheme. For finite 
population size, (x(t + 1)) G Aqo is approximated by using the sampling process to 
obtain x(t + 1) G A^- The deviations thereby possible become larger with decreasing 
m and distort the finite population dynamics as compared to the infinite population 
case. This results in fiuctuations and epoch formation as shown in |]TD|, |TT|, |T2|. In 
the following, we will consider the infinite population limit, because it refiects the 
exact fiow of probabilities for a particular fitness landscape. In a second step, the 
fiuctuations and epoch formation introduced by the finiteness of a real population can 
be studied on the basis of that underlying probability fiow. 

The generation operator is assumed to decompose into a separate mutation and a 
separate selection operator, like 

Gt = M-S{t), (1) 

where the selection operator S{t) contains the time dependency of the fitness land- 
scape. Crossover is not considered in this work. 

Inspired by molecular evolution, and also by common usage, we assume that the 
mutation acts like fiipping each bit with probability /i. If we set the duration of one 
generation to 1, equals to the mutation rate. The mutation operator then takes on 
the form 

M= (^'^ ^ \ , i. e. Mij = _ ^)'-<^H(ij)^ 

where ® denotes the Kronecker (or canonical tensor) product and d}i{i,j) denotes the 
Hamming distance of i and j. 

To keep the description analytically tractable, we will focus on fitness-proportionate 
selection, 

S{t) ■ X = F{t) ■ x/(/(t)),, where F{t) = diag(/o(t), . . . , U^,{t)) 

and {fit))^ = J:^mx^=\\Fit)■x\\^. 

This will already provide us with some insight into the general behavior of a GA in 
time-dependent fitness landscapes. 

Since the GA corresponding to Eq. |I| applies mutation to the current population and 
selects the new population with complete replacement of the current one, it is called 
a generational GA (genGA). In addition to genGAs, steady-state GAs (ssGAs) with a 
two step reproduction process are also in common use: First, a small fraction 7 of 
the current population is chosen to produce mutants according to some heuristics. 
Second, another fraction 7 of the current population is chosen to get replaced by those 
mutants according to some other heuristics (see jl^ , p!5| , |T6| and references therein). We 



can include ssGAs into our description in an approximate fashion by simply bypassing a 
fraction (1 — 7) of the population into the selection process without mutation, whereas 
the remaining fraction 7 gets mutated before it enters the selection process. The 
generation operator then reads 

Gi = [(1-7)1 + 7M]S(t). (2) 

By varying 7 within the interval ]0,1], we can interpolate between steady-state be- 
havior (ssGA) for 7 -C 1 and generational behavior (genGA) for 7 = 1. Equation ^ is 
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only an approximation of the true generation operator for ssGAs because the heuristics 
involved in the choice of the mutated and replaced members are neglected. But in the 
next section, the heuristics are expected to play a minor role for our general conclusion 
on an inertia of ssGAs against time-variations. 

At this point, we want to review shortly the correspondence of our GA model with 
the quasispecies model, extensively studied by Eigen and coworkers ||, 0, || in the 
context of molecular evolution theory (see also |jl3| in this book). The quasispecies 
model describes a system of self-replicating entities i (e. g. RNA-, DNA-strands) with 
replication rates and an imperfect copying procedure such that mutations occur. 
For simplicity reasons, the overall concentration of molecules in the system is held 
constant by an excess flow $(t). In the above notation, the continuous model reads 

i(t) = [M-F(t)-$(t)]x(t), (3) 

where the flux needs to equal the average replication, ^(t) = (/(t))x(t), in order to 
keep the concentration vector x(t) normalized. This model might then be discretized 
via t t/6t, which unveils the similarity to a ssGA: 

x(t + l) = [(l-5t(/(t))x(t))l + 5tM-F(t)] x(t) for(5t<l. (4) 

By comparison with Eq. we can easily read off that •y = 6t (/(t))x(t) =: 7x(t)- This 
means a low (resp. high) average fitness leads to a small (resp. large) replacement - a 
property that is not wanted in the context of optimization problems, which GAs are 
usually used for, because one does not want to remain in a region of low fitness for a 
long time. Another difference to ssGAs is the fact that in the continuous Eigen model, 
selection acts only on the mutated fraction of the population - although this leads 
only to subtle differences in the dynamics of ssGAs and the Eigen model. 

Equation ^ is commonly referred to as 'continuous Eigen model' in the literature, 
because of the continuous time, and Eq. |^ is simply its discretized form which can 
be used for numerical calculations. Nonetheless, the notion 'discrete Eigen model' is 
seldom used for Eq. ^ but it is often used for the genGA, 

x(t + l) = [M-5(t)]x(t), (5) 

in the literature. This stems from the identical asymptotic behavior of Eqs. ^ and ^ 
for static fitness landscapes. However, there are differences for time-dependent fitness 
landscapes, as we will see in the following two sections. 

2 Regular Changes and Generalized Quasispecies 

In the case of a static landscape, the fixed points of the generation operator, which 
are in fact stationary states of the evolving system (if contained within A^, see [llO|), 
can be found by solving an eigenvalue problem, because of 

x = G'x MFx=(/)xX. (6) 

Let Aj and v, denote the eigenvalues and eigenvectors of MF with descending order 
Ao > ■ ■ ■ > -^2'-! s-nd ||vj||i = 1. For fi ^ 0,1 the Perron- Frobenius theorem assures 
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the non- degeneracy of the eigenvector vq to the largest eigenvalue and moreover it 
assures Vq G A^o- Often, Vq is called Perron vector. After a transformation to the 
basis of the eigenvectors {vj} it can be straightforwardly shown that x{t) converges 
to Vo for t — > oo. The population represented by vq was called the 'quasispecies' by 
Eigen, because this population does not consist of only a single dominant genotype, 
or string, but it consists of a particular stable mixture of different genotypes. 

Let us now consider time-dependent landscapes. If the time dependency is intro- 
duced simply by a single scalar factor, like 

F{t) = Fp{t) with > for alH, 

it immediately drops out of the selection operator for GAs. For the continuous Eigen 
model, we note that the eigenvectors of F{t) and F are the same and that Xi{t) = 
Xip{t). Since p{t) > 0, which is necessary to keep the fitness values positive, the 
order of the eigenvalues remains, such that MF{t) will show the same quasispecies 
Vo as MF. Contrasting to that special case, a general, individual time dependency 
of the string's fitnesses does indeed change the eigenvalues and eigenvectors of MF{t) 
compared to MF . For an arbitrary time dependency the Perron vector is constantly 
changing, and therefore, wc cannot even define a unique asymptotic state. However, 
this problem disappears for what we call regular changes. After having established a 
theory for such changes, we can then take into account more and more non-regular 
ingredients. What do we mean by "regular change"? We define it heuristically in the 
following way: a regular change is a change that happens with fixed duration r and 
obeys some deterministic rule that is the same for all change cycles. Let us express the 
latter more formally and make it more clear what we mean by "same rule of change" . 
Within a change cycle, we allow for an arbitrary time dependency of the fitness, up to 
the restriction that two different change cycles must be connected by a permutation 
of the sequence space. Thus, if the time dependency is chosen for one change cycle, e. 
g. the first change cycle starting at t = 0, it is already fixed for all other cycles, apart 
from the permutations. We will represent permutations tt from the permutation group 
©2' of the sequence space as matrices 

{P^)ij = for i, j e {0, . . . , 2' - 1}. 

The permutations of vectors x and matrices A are obtained by 

where denotes the transpose of P-,^ with the property Pj — P^-i = P~^. 

In reference to the first change cycle, we define the fitness landscape F{t) as being 
single-time- dependent, if and only if for each change cycle n G Nq there exists a 
permutation 7r„ e ©2^ such that for all cycle phases <^ e {0, . . . , r — 1} 

PnF{ip + nT)P^ ^ F{ip) (abbreviatory P„:^P^J. 

We will call each permutation P„ a. jump-rule, or simply rule, which connects F{ip-\-nT) 
and F{(p). To make predictions about the asymptotic state of the system, we need 
to relate the generation operators of different change cycles to each other. This is 
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readily achieved if the permutations Pn commute with the mutation operator M. The 
condition for this being the case is that for all 

Mij = M^„(i),^„(j) or equivalently dii{i,j) = dn{rCn{i) , T^niJ)) ■ 

Thus, the Hamming distances dii{i,j) need to be invariant under the permutations 
Pn- Geometrically this means that the fitness landscape gets "translated" or "rotated" 
by those permutations without changing the neighborhood relations. Then, we find 
for arbitrary n G N and ^9 G {0, . . . , r — 1}, 

To study the asymptotic behavior of the system, it is useful to accumulate the time 
dependency of a change cycle by introducing the r-generation operators, 

r„ := Gr-l+nr " " " G'nr for all U G Nq- 

Because of Eq. ^ all these operators are related to Fq by 

p pT^r p 

This property allows us to write the time evolution of the system in the form 

x(yp + nr) = P^_^V^Pn-i ■ ■ ■ P^ToPi Tq x((^), (8) 

where (y9 G {0, . . . , r — 1} denotes in the following always the phase within a cycle. 

Let us consider the special case of a single rule P being applied at the end of each 
change cycle, which results in P„ = (P)", e. g. imagine a fitness peak that moves at a 
constant "velocity" through the string space. We will see below that for those cases 
it is possible to identify the asymptotic state with a quasispecies in analogy to static 
fitness landscapes. Because of that, we can now define the notion of regularity of a 
fitness landscape formally in the following manner: 

A time-dependent fitness landscape F{t) is regular, if and only if: (i) the fitness 
landscape is single-time- dependent, (ii) there exists some rule P G ©2' which is applied 
at the end of each cycle such that P„ = (P)", and (iii) the rule P commutes with the 
mutation operator M. 

In this case, we get with PP^ = 1 the time evolution 

x(^ + nr) = (pT)"(Pro)"x(^). (9) 

To proceed, it is useful to permute the concentrations compatible to the rule of the 
fitness landscape. By this, concentrations are measured in reference to the fitness 
landscape structure of the start cycle n = 0. We will denote those concentrations by 
x'(t) and they are related to the concentrations x(t) by 

x'((/^ + nr) = (Pf x(^ + nr) 

= {PTorA^) and x'(y.) = x(^). ^ 

For example, if there is no time-dependency within the cycles, some a;- will for all 
cycles measure the concentration of the highest fitness string, independent of its current 
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position in string space. Thus, x'(t) evolves in a fitness landscape with periodic change, 
which can also be seen from the second line of Eq. |10[ In analogy to the static case 
Eq. ^ the calculation of fixed points of x'(t) is equivalent to an eigenvalue problem, 

x'(t + r) =x'(t) ^ Pfox'(t) = ||Pfox'(t)||i x'(t), 

where Tq is the unnormalized r-generation operator obtained from the accumulation 
of the unnormalized generation operators = MF[ip). 

The corresponding periodic quasispecies vq can be calculated for all phases ip of 
the change cycle from the Perron vector vq of PFq in the following way, 

x'(v5 + nr) ^^-^ vo(v5) = G'^_i---G'oVo for G {0, . . . , r - 1}. (11) 

To find the asymptotic states of the concentrations ^(t), we simply need to invert Eq. 
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x((^ + z/r) = (P^)V((^ + z/r) ioT u e {0,... ,r]-l}, (12) 

where rj := ordP is the order of the group element P G 62'- 

The essential reason for the existence of asymptotic states for x(t) lies in the 
finiteness of the permutation group &2i- Because of = 1, we find directly from Eq. 
P the asymptotic state 

^{^ + nr]T) = (pro)^"x(t) ^ vo(^), 

where vo(v?) is the same as in Eq. |ll|, because (PFq)'^ and PFq have the same eigen- 
vectors, in particular the same Perron vector. Moreover, we get 

^{^ + {u + n7])r) {P^MV) for z/G{0,..., 77-1}, (13) 

which is the same result as Eqs. and |l^ yield. In the limit of long strings / — > 00, 
ordP is not necessarily finite anymore. If ordP^^ — >oo, then the asymptotic states 
Eq. |13| for x(t) do not exist, but Eq. still holds. Hence, a quasispecies exists even 
in the limit / — > cxd if measured in reference to the structure of the fitness landscape. 



In conclusion, Eqs. |Tl] and |1^ represent the generalized quasispecies for the class 
of regular fitness landscapes which includes as special cases static and periodic fitness 
landscapes. In fact, the simplest case of a regular change is a periodic variation of 
the fitness values fi{t) = fi{t + r) because no permutations are involved (P = 1) and 
hence x'(t) = x(t) for all t. The quasispecies was generalized for this case already in 



and - using a slightly different formalism - in . In Section ^, we will study a 
more complicated example. 



3 Schematic Phase Diagram 

To get an intuitive feeling for the typical behavior of ssGAs and gen G As, let us consider 
some special lines in the plane spanned by the mutation rate fi and the time scale for 
changes r, as shown in Fig. |1|. The mutation operator represents only for < 1/2 a 
copying procedure with occurring errors, whereas for yU > 1/2 it systematically tends 
to invert strings, i. e. it resembles an inverter with occurring errors. Since mutation 
should introduce weaA; modifications to the strings, we will consider only /i < 1/2. 
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Figure 1: Schematic phase diagram: time-average regions due to low mutation (dark gray) 
and large inertia (light gray, left) , quasi-static region for slow changes (light gray, 
right) . 



Disorder line: For n — 1/2, the Perron vector of MF(t) is always Vq = (1, . . . , l)/2'. 
The population will therefore converge towards the disordered state. Because of 
the continuity of M in /i, we already enter a disordered phase for // fa 1/2. 

Time-average region: For /jl — 0, the mutation operator is the identity. We find as 
time evolution simply the product average over the fitness of the evolved time 
steps: 



x(i + r) 



n ^(v^) 



x(t) 



lp=t 

F{t + r, t) x(t)/ii ... Ill with F{t +T,t) = n;tr ' n^) 



Since diagonal operators commute, the order in which the F{ip) get multiplicated 
does not make any difference. For the case of a r-periodic landscape, F = 
F{t + T,t) = F{t,0) is independent of t. The quasispecies is then a linear 
superposition of the eigenvectors of the largest eigenvalue of the product averaged 
fitness landscape F - there might be more then one such eigenvector, since 
F is diagonal and the Perron-Frobenius theorem does not apply. Because of 
the continuity of M in the dynamics are governed already for < ;U ^ 1 
by the product average F. Analogous conclusions apply to those non-periodic 
landscapes for which by choosing a suitable time scale r a meaningful average 
F{t + T, t) can be defined. 
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Figure 2: Phase diagrams for (left): needle- in-the-haystack with oscillating height at fre- 
quency (J = 27r/r, (right): needle-in-the-haystack that jumps after r time steps 
to a randomly chosen nearest neighbor. 



For ssGAs, 7 is small and we find to first order in r^: 
x(t + r) = {l-T-f)F{t + T,t) 



i/j=0 mth factor from left 



(/3th factor from left 

If T7 <^ 1 holds, the time evolution is governed by F{t,t + r). For changes on 
a time scale r, we find time-averaged behavior if r ^ I/7. Thus, the width of 
the time-average region is proportional to I/7. A detailed analysis of the effect 
of the different positions of the mutation operator M within the r7-term, which 
is otherwise an arithmetic time-average, has not yet been carried out. 

Quasi-static region: If the changes happen on a time scale r very large compared 
to the average relaxation time (~ l/(Ao — Ai)) the quasispecies grows nearly 
without noticing the changes. Thus, in the quasi-static region all quasispecies 
that might be expected from the static landscapes F = F{t) will occur at some 
time during one cycle r. 



Wilke et al. raise in |T8[ the schematic phase diagram of the continuous Eigen 
model, which exhibits the same time-average phases as that for ssGAs. Their result is 
in perfect agreement with two recently, exphcitly studied time-dependent landscapes. 
First, Wilke et al. studied in [|l^ a needle-in-the-haystack (NiH) landscape with oscil- 



lating, r-periodic fitness of the needle, i. e. 

/o(i) > /i = ■ ■ ■ = /2'-i = 1 and /o(t) = (Texp{esin(27rt/r)} . 

The continuous model was represented for 5t — as Eq. ^and the periodic quasispecies 
Eq. |Tl| was calculated. Figure (left) shows the resulting phase diagram. For small r. 
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# 01 



00 01 11 10 00 
lower bits 

Figure 3: A regularly moving needle-in-the-haystack for string length I = A. In (left), 
the solid arrow represents the next jump to happen, whereas the gray and solid 
arrows all together represent the jumps that happen one after the other under 
the rule P of rotating the two lower bits as shown in (right) with rotation angle 
7r/2 at every jump. 



the error threshold is given by the one of the time-averaged landscape, whereas for large 
r, the error threshold oscillates between minimum and maximum values corresponding 
to mint/o(t) and maXi/o(t), as expected in the quasi-static regime. Second, Nilsson 
and Snoad studied in [|T^ a moving NiH that jumps randomly to one of its nearest 
neighbor strings every r time steps. The time-average of this landscape over many 
jump cycles is a totally flat or neutral landscape, which explains the extension of the 
disordered phase to small /x and small r as it is shown in Fig. ^ (right). In the quasi- 
static region, order is expected because the needle stays long enough at each position 
for a quasispecies to grow. Hence, we can understand the existence of the observed 
and calculated phase diagrams in Fig. |^ from simple arguments. In fact, they are 
special instances of the general schematic phase diagram depicted in Fig. ^ 

In the following, we will consider regularly moving NiHs and derive the infinite 
population behavior of a genGA in such landscapes. This is interesting, since gen G As 
should be considered to adapt faster to changes compared to ssGAs, as the missing 
time-average region of gen G As for small r suggests. To clarify whether a different 
phase diagram compared to Fig. |^ (right) emerges for gen G As with moving NiH, we 
will calculate the phase diagram including the optimal mutation rate that maximizes 
a lower bound for the concentration of the needle string in the population. 

4 Generational GA and a moving NiH 

In this section, we want to analyze quantitatively the asymptotic behavior of a genGA 
with NiH that moves regularly in the sense of Section |]to one of its / nearest neighbors 
every r time steps. At the end, we will also be able to comment on the case of a NiH 
that jumps randomly to one of its nearest neighbors. 

A simple example of a NiH that moves regularly to nearest neighbors is shown 
in Fig. ^ (left). Each jump corresponds to a 7r/2-rotation of the four-dimensional 
hypercube {0, l}"^ along the 1100 axis, i. e. the lower two bits are rotated as shown 
in Fig. ^ (right). We will call the set of strings {P^i | n G N} which is obtained by 
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Figure 4: The equivalence of a 27r/3-rotation along the 1' 
shift, denoted by P<^, for string length / = 3. 



1 axis and a cyclic 1-bit left- 



applying the same rule P G over and over to some initial string i G {0, 1}', the 
orhit of i under P. The period length 4 of the orbit shown in Fig. |^ (left) originates 
from the rotation angle 7r/2 and hence is independent of the string length /. The orbits 
of such rotations will always be restricted to only four different strings. For reasons 
that will become clear below, we are looking for regular movements of the needle that 
are not restricted to such a small subspace of the string space. Instead, the needle 
is supposed to move 'straight away' from previous positions in string space. Since a 
complete classification and analysis of all possible regular movements for given string 
length I and jump distance d is out of the scope of this work, we will simply give an 
example of a rule P G that generates such movements: the composition of a cyclic 
1-bit left-shift, which we denote by P<^, and an exclusive-or with - --Ol, which we 
denote by P®. For string length / < 3, P^ corresponds to a 2ti /I rotation along the 
1 ■ ■ ■ 1 axis as can be seen in Fig. ^. Moreover, the orbit of ■ ■ ■ under P^^ = P® o P< 
is shown in Fig. ^ also for / = 3. For arbitrary string length /, it is more difficult to 
visualize the action of P<^ and hence of P®<. But, it is easily verified that starting 
from all zeros ■ ■ - 0, the string with n < I ones ■ ■ ■ 01 • • • 1 will be reached after 
exactly n jumps. Moreover, the orbit of ■ ■ ■ under P^<^ has the period length 21. 
In the limit of long strings / — 00, this periodicity is broken because the needle never 
(i. e. after 00 many jumps) returns to all zeros ■ ■ ■ 0, but - as we have shown in Eq. 
|TT| using Eq. |10| - there still exists an asymptotic quasispecies. 

How does our simple GA behave with a NiH that moves according to P®<? In Fig. ^ 
two typical runs of a genGA with a NiH like that are depicted. The setting (m, /, /q, r) 
was kept fixed but two different mutation rates fi were chosen. In the case of Fig. 
H (right), the mutation rate is 'too high' to allow the population to track the movement. 
The concentration of the future needle string (solid line) cannot grow much within one 
jump cycle resulting in a decreasing initial condition (bullet) for the growth of the 
needle concentration (dotted line) in the next cycle. The population looses the peak 
- in this case after ^ 90 generations. It might happen that the population finds the 
needle again by chance (or better saying the moving needle jumps into the population), 
but the population will not be able to stably track the movement. Contrasting to that, 
the mutation rate was chosen to maximize the concentration of the future needle string 
at the end of each jump cycle (bullets) in Fig. |^ (left). Since in that case, the best 
achievable initial condition is given to each jump cycle, the movement of the needle 
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Figure 5: The orbit of • • • under P®< (black dots) for string length I = 3. The numbers 
(1), ... , (6) show the order in which the strings are visited by the needle, starting 
from 000. 
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Figure 6: Run of a genGA with regularly moving needle-in-the-haystack. The parameter 
setting was m = 1000000,/ = 20, /o = 5,r = 4, (left): /i = 0.022, (right): 
jJL = 0.055. In both cases the system evolved for 100 generations (not shown) 
without any occurring jumps in order to let a typical quasispccics grow around 
the initial needle string. In generation 20 the first jump happened and afterwards 
every r = 4 generations, solid line: xi(n, t), dotted line: xo(ra, t), bullet: jump - 
xo(n -I- 1,0) = xi(n,T). 
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fixed point x^^ = Xfix(oo, / , /o, r, fi) 




/o = 10, / = 20, r = 4, /X = /iopt 



• 0.005 0.01 0.015 0.02 0.025 
initial condition Xo{n, 0) = Xi(n — 1, r) 



,0.03 



Figure 7: The fixed point which is reached by an infinite population for n ^ oo. 



is tracked with the highest possible stability for the given setting (m, /, /o, r). As can 
be expected from Fig. ^ and is affirmed by further experiments, the bullets keep on 
fluctuating around an average value for n — > oo which is for the infinite population 
given by the quasispecies Eq. |TT|. In the following, we are going to model that system 
with some idealizations and we will calculate a lower boundary for this average value. 

We adopt the viewpoint of permuting the concentration vector compatible to the 
movement of the needle as we have done implicitly in Fig. ^ and formally in the 
definition of x'(t) in Eq. |TD|, but we drop the primes henceforth. The concentration 
of the needle string within jump cycle n is denoted by X(){n, cp) and the concentration 
of the string the needle will move to with the [n + l)th jump (i. e. the future needle 
string in jump cycle n) is denoted by xi{n,(p). The initial cycle prior to which no 
jump has occurred is ri = 0. Within a cycle, the time or generation is counted as 
phase ip G {0, . . . ,r}. Two succeeding cycles are connected by the (approximated) 
rule of change 



Xo(n + 1, 0) = Xi(n, r) and Xi(n + 1,0)~0. 



(14) 



The second relation is an approximation which is made to simplify the coming calcu- 
lations, but it holds only if the needle jumps onto a string which has not been close to 
one of the previous needle positions. Otherwise, the future needle string could already 
be present with a concentration significantly larger than 1/2' 0. In Fig. |^, we have 
chosen the rule P®< to get experimental data for a case in which this assumption is 
fulfilled. Later on we will see that we can still make useful comments about cases in 
which that approximation is partly broken. 

If we plot xo{n + 1,0) = xi{n,T) against xo{n,0), we get an intuitive picture for 
the system's evolution towards the quasispecies. The concentration XQ{n, 0) converges 
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for n ^ oo towards a fixed point, 

Xfix := lim Xo{n,0), 

as shown in Fig. |^ for a finite value of Xfix- Obviously, this fixed point depends on the 
full setting Xfix = X{i^{m,l, fo,T, fi). Since we are especially interested in the effects 
of various cycle lengths r and mutation rates /i, we keep (m, /, /o) fixed, such that 

In the remaining of this section, we will calculate Xo{n + 1, 0) = Xi{n, r) in depen- 
dence on XQ{n, 0), which is the solid curve in Fig. |^, for arbitrary parameter settings. 
From this knowledge, we will construct the phase diagram. Since we stay within one 
jump cycle, we drop n to take off some notational load. 



4.1 Derivation of the Fixed Point Concentrations 

To calculate xi(r), it is sufficient to take only xq and xi into account, because the 
assumed initial condition is xi(0) ~ 0, such that the main growth of xi is produced 
by the mutational flow from the needle. Moreover, we assume /i to be small enough 
such that terms proportional to /i^ can be neglected. This means we restrict ourselves 
to the case in which the system is mainly driven by one-bit mutations. Without 
normalization, the evolution equations then read 

yo{t+l)= {l-fiY /oyo(t) + {Ml-/^)'"'l/iW}, .^g. 
l/i(t+l)=^(l-^)'-VoZ/o(t)+ yi{t), 

where yi denote unnormalized concentrations in contrast to the normalized concentra- 
tions Xi. 

For /o(l — fi) ^ fi, which is always the case for large enough /q, we can further ne- 
glect the back- flow {■ ■ ■ } from the future needle string compared to the self- replication 
of the current needle string. The solution of Eq. |T3| is then given by 

?/o(t)= [(l-/i)7o]*Z/o(0), 

l/i(t) = /€i(^)yo(0) + (l-/i)%(0). 



with 



The coefficient Utif^) measures the growth of yi{t) starting from the initial condition 
yi(0) ~ 0, yo{0) 7^ 0. As long as yo(t) + yi(t) <^ 1, this gives already a good approxima- 
tion for the concentrations xo{t) and xi{t). But in general, this approximation breaks 
down for large t, because of the exponential growth of yo{t)- We need to normalize 
our solution, which can be done by 

x(t) = y(t)/(/)o ■ • ■ {f)t-i, where {f)t = (/o - l)xo(t) + 1. (16) 
By expressing the fitness averages in terms of yo{t), we find, after solving a simple 
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(/)o ■ ■ ■ (/).-! = 1 + (/o - 1) K=o(l - l^rfo] ^o(O) 
= l + (/o-l)A(/i)xo(0), 

where = ^ and / = (1 - /i)7o- 

Finally, we arrive at the normalized concentrations 

Xo{t) = [(1 - /x)7o]*a;o(0) / [1 + (/o - l)A(/i)xo(0)] , 

= [/€t(/i)a;o(0) + (l-/i)V(0)] / [1 + (/o - l)A(/i)xo(0)] . 

The asymptotic state can now be calculated by using the initial condition Xi(0) ~ 
0,Xq{0) 7^ and demanding Xi(r) = Xo(0). It is easily verified that for the fixed point 
follows 



4.2 Consistency in the Quasi-Static Limit 

How can we test the quality of the approximate result Eq. p!7| ? For large cycle lengths 
r, we enter the quasi-static regime, where we can approximate the population at the 
end of each cycle by the quasispecies of the corresponding static landscape. Figure ^ 
shows a comparison of the exact numerical calculations of the quasispecies (r — > oo) 
and the calculations (r = 100). In the numerical calculation, the back- 

fiow from the first error class to the needle string is included. Overall, we find the error 
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0.02 0.04 0.06 0.08 0.1 0.12 

mutation rate /i 



Figure 9: Fixed point concentration xa^ir, fi) for different values of r. For faster changes, 
the fixed point concentration rapidly drops down. 



threshold and the maximum of the fixed point concentration well represented. This 
also suggests that the deviation of the 0{fi^) approximation from the exact values 
should be small for smaller r, because those deviations add up for r ^ oo by the 
iterative procedure. 

How do the calculated fixed point concentrations compare to simulations with 
(large) finite population? In Fig. ^, the values of Xfix{oo,n) and Xfy^{4, fj.) are shown. 
For T — > oo, the deviation from the average {xi{n,(p)) (in generations — 20) is in 
fact the same as what can be read off in Fig. ^j. The deviation of Xfix(4,/x) from the 
average value (xo(^, 0)) in generations 24,28, . . . , 100 is significantly larger. This is 
caused by the neglect of all other strings' contributions apart from the current needle 
string's contribution to the flow onto the future needle string. These neglected contri- 
butions increase the average fixed point concentration measured in the experiment in 
comparison to the calculated value a:fix(T, /x). But even though there are deviations, 
we conclude that the approximately calculated value is always a lower bound for the 
exact value. In the next section, we will use this observation to derive an expression 
for the mutation rate that maximizes the average fixed point concentration. 

4.3 Phase Diagram 

In Fig. ^ the fixed point values xnxir, fi) are shown for small cycle lengths r. For the 
shown parameter setting, the region with Xfix(2,/i) > is extremely small. We notice 
that there are two error thresholds, one for 'too low' mutation rates, fith<, and one 
for 'too high' mutation rates, fith>- The intuition behind that was already given in 
Section ^. For too low mutation rates the population becomes slow and evolves in the 
averaged, fiat landscape, whereas for too high mutation rates the usual transition to 
the disordered phase takes place. In the following we will calculate the phase diagram 
starting from Eq. |l^. 
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Error Thresholds: The error thresholds are given by 



a;fix(T, /U) = Krifi) = 1. (18) 

This is the same condition as one would get using only unnormalized concentrations 
yi{t). Since yi{t) ~ near the error thresholds, the neglect of the normalization is not 
critical for the calculation of the error thresholds themselves, whereas it is important 
for the optimal mutation rate and of course for the fixed point concentration. Since Eq. 



18| cannot be solved for in closed form, we write down the corresponding recursion 
relation that converges, for a suitable starting value of /i, to the solution of Eq. [T^ in 
the limit k oo, 

(fc) 1 / A (fc-l)A (0) n 

Kh< = l/ar{l- Kh< ) > Kh< = 0> 

For /ith<, a good starting value is 0, since /ith< ~ anyway. For /ith>, the approximate 
value for the error threshold of the static (i. e. r ^ oo) landscape be chosen, 

which is obtained by calculating the fixed point [using Eq. [l^ and |l^ , 

Jo — ^ 

setting it to zero and solving for /x. 

Optimal Mutation Rate: In order to track changes with the best achievable sta- 
bility for a given setting (m, /,/o,r), the lowest possible concentration (infimum of) 
xo{n,ip) needs to be maximized, because a low concentration might result in the loss 
of the needle string in a finite population. Since for infinite populations Xo{n,ip) is 
monotonously increasing with (f it is sufficient to maximize Xo{n,0). Moreover, we 
derived above that XQ{n, 0) approaches the fixed point value xa^lr, /x) for n oo. For 
finite populations, we expect similar behavior but the strict monotony of xo{x,ip) in 
ip will be destroyed by fiuctuations and also the fixed point value itself will fiuctuate 
around some average value (xfix) as can be seen in Fig. ^. However, the safest way to 
avoid any loss of the needle string is still to maximize the average fixed point value 
(xfix)- In this sense, we define the optimal mutation rate /iopt as the one that maximizes 
(xfix) • In the previous Section we noted that our approximated infinite population 
value a;fbc(T, n) represents a lower bound for (xq^), where the maxima of the two curves 
are expected to coincide for fixed r. Thus, /Xopt can be obtained by maximization of 
Xfix (r,/i). 

We can derive an expression for the optimal mutation rate /iopt from 

-^(^,/^opt) = 

If we neglect the \i dependence of /3r(/^) in Eq. which corresponds to the approach 
1^ , we simply find i^^<Sj-, I) = l/lr. Because of fi^p^^ '^^°°> 0, this result is inconsis- 



m 



tent with the quasi-static limit, because /^opt should approach the value for which the 
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concentration of 1-mutants in the quasispecies of the corresponding static NiH land- 
scape is maximized. We conclude that the /x dependence of cannot be neglected 
for the correct optimal mutation rate, which we are now going to calculate. 

For ar ^ 1, which is the case for r ^ 1 and /o > 1, or r ^ 1 and /o ^ 1, we 
can neglect the —1 in the numerator of Xfix(T, ^) and take only a^- into account for the 
calculation of dxa^/dfi. After some algebra, we find 



A'opt 



(r-i)(/-i) 



where / = /o(l - /iopt 



Since / = /(/iopt), this equation cannot be solved in a closed form for yUopt- However, 
for r — >^ oo the equation simplifies to 



A'-opt 



(/-I)/// :/>l 
: / < 1. 



In the case / > 1, we find 



//io~pt)(l 



H'opt) 



l//o. 



By approximating (1 — yu)' ^ (1 — //i)^, we get a cubic equation. The real root of that 
equation is approximately ||20| given by (see also Fig. |10]) 



1 + 



(/ - l)/i+(l - 



3/(/-l)/x;-2/i+(3/-l)+4 



with /i+ 



/o 



-1/2 



(19) 
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Figure 11: The calculated phase diagram for a genGA with stochastically moving needle- 
in-the-haystack; two settings are shown: /o = 2, 10, for both / = 20. 



Resulting Phase Diagram: From the above, we are able to plot the phase diagram 
for our model as shown in Fig. |Tl]. Two settings are plotted. For /o = 2 (resp. 10) the 
diamonds (resp. circles) are the numerically obtained error thresholds. The solid and 
dash-dotted lines are fj,[^^ and /i^h>- To show the convergence property of /ith<,>) /^th<,> 
are plotted for /o = 10 as dashed lines. Obviously, the needed corrections to the chosen 
starting values increase for smaller r, such that more iterations are needed to describe 
the error thresholds correctly for small r. The expressions fi[^^ > are already a good 
approximation for the given settings. Representing the quasi-static limit, is plotted 
as dotted line and gets consistently approached by yUth>(T) for r ^ oo. Furthermore, 
fJ,'^^ is plotted as dash-dot-dotted line. The numerically measured values for /iopt(T) 
are shown for fo = 2 (resp. 10) as triangle (resp. squares). They approach //^^ very 
quickly already for r ^ 20 (resp. 10). 

We conclude that the above quantitative description is in good agreement with 
the numerical observations and approaches the quasi-static region in a consistent way. 
Moreover, the phase diagram fits well into the general one raised in Section ^. Even 
in the considered case of a genGA, we find - depending on the parameter setting - a 
time-averaged phase for very small r. The time- averaged phase broadens for small /q. 



4.4 Stochastically moving NiH 

Up to now, we analyzed a regularly moving NiH, for example with the rule What 
happens if the NiH is allowed to move to a randomly picked nearest neighbor, as it is 
shown in Fig. |l^ for / = 4? Two typical runs of a genGA with this fitness landscape are 
depicted in Fig. |T^. The setting (m, /, /o, r) was chosen the same as in Fig. ^ which 
allows for a direct comparison of the GA's behavior for regularly and stochastically 
moving NIHs. The overall behavior is similar. For large mutation rates, the population 
looses the needle string, whereas the moving needle is tracked stably for mutation rates 
close to the above defined optimal mutation rate. In addition, strong fluctuations in 
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Figure 12: A stochastically moving needle-in-the-haystack for string length I = 4. The 
needle is allowed to jump to one of its nearest neighbors which is chosen at 
random. 
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Figure 13: Run of a genGA with stochastically moving needle-in-the-haystack. The param- 
eter setting in (left) and (right) were the same as in Fig. ^ (left) and (right). 
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the values of xi{n,0) (lower ends of solid lines) as well as xo{n + 1,0) = xi{n,T) 
(bullets) occur in the stochastic case. These result from back-jumps . If, at the end of 
the current cycle, the needle jumps back to the string it has been to in the previous 
cycle, then xi{n, 0) = xo{n — 1, r) is significantly larger than zero. This can be seen in 
Fig. |13| (right) at generations 36, 40 and 64 and also in Fig. |13] (left) at generations 72 
and 88 (the gaps in Fig. |13| (left) correspond to xi,xq being much larger than 0.025). 
If no back-jumps occur, as in generations 24 — 72 in Fig. ^ (left), the system with 
stochastic NiH behaves nearly indistinguishable from the one with regularly moving 
NiH. Since back-jumps always increase the concentrations of the needle string in the 
very next occurring jumps, the above calculated fixed point Xfix(T, /i) is still a lower 
bound. Thus, our previous notion of optimal mutation rate remains applicable to the 
stochastically moving NiH although the assumption xi{n,0) ~ from Eq. |n| is not 
always fulfilled. 

Nilsson and Snoad fl^ did their analysis of the continuous Eigen model Eq. ^ with 
stochastic NiH in a similar way as we did above. In analogy to their calculation for the 
continuous Eigen model, we find for a genGA the optimal mutation rate fi^p^{T,l) = 
1/It which is inconsistent with the quasi-static limit (see Section [4^ ). The reason is 
the missing normalization in the work of Nilsson and Snoad. Furthermore, they could 
not derive an expression for the fixed point concentration x^^{t,^) because of that 
same reason. 



4.5 Jumps of larger Distance 

To conclude this section about the behavior of genGAs with different kinds of NIHs that 
move to nearest neighbors, let us shortly discuss jumps of Hamming distance d larger 
than one. Obviously, the analytical calculations get more complicated, because the 
C(//^)- approximation is not sufficient anymore as it connects only nearest neighbors. 
To describe jumps of a larger distance, the concentrations of some intermediate se- 
quences need to be taken into account, so that we have to solve a time evolution much 
more complicated than Eq. Hence, we cannot make simple statements for finite r. 
On the other hand, the system approaches the quasi-static region for large r and it is 



characterized by A''5^< > and /z^^ as we have seen in Fig. [Tl|. The exact quasispecies for 
r ^ oo is shown in Fig. The plotted values are error class concentrations, in order 
to make the higher error classes visible at all. Each fc-mutant has a concentration of 
Xk/ (j^) in the quasispecies state, because for a NiH the mutant's fitness depends only 
on its Hamming distance to the needle and therefore all (^) /c- mutants have the same 
concentration in the quasispecies. For finite populations, this is only true on average, 
because the asymptotic state is distorted by fiuctuations. But in the following, we 
assume that the quasispecies is still representative for the average distribution of the 
population in the asymptotic state. Then, the optimal mutation rate in the sense of 
Section ^]B| for jumps of distance d is by definition the position of the maximum of Xd- 
For d > 1/2, optimal mutation rate and error threshold become identical. Although Xd 
is maximized for mutation rates close to the error threshold it amounts, as do all other 
concentrations to only ^ 1/2' , which leads to an approximately random drift for finite 
populations. On the other hand, the chance of tracking the needle decreases even fur- 
ther for small mutation rates because then the concentration Xd becomes even smaller. 
In this sense, the quasispecies distribution, which is centered on the needle string, is 
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Figure 14: The quasispecies for the static NiH in dependence on the mutation rate /i. The 
concentrations Xi of the ith error class for i G {0, . . . , [V^J} are depicted. The 
optimal mutation rates for jumps of Hamming distance d = 1,2,3,4 are shown 
as dotted lines. 



useless for tracking the next jump ii d > 1/2. This also suggests - in agreement with 
the experimental findings of Rowe [jl3l (in this book) - that finite populations are for 
low mutation rates unable to track large jumps - in particular in the extreme case 
d = I. Only for jumps of d < 1/2 the corresponding error class concentration Xd shows 
a concentration maximum significantly above 1/2'. From the heights of the concen- 
tration maxima, we see that the difficulty of tracking the changes increases with the 
Hamming distance d of the jumps. Vice versa, the advantage a population gets after 
a jump from its structure prior to the jump decreases with increasing jump distance 
d. In addition, a mutation rate which is simultaneously optimal for more than one 
distance cannot be found. 



5 Conclusions and Future Work 

On the basis of general arguments, the phase diagrams of population-based mutation 
and probabilistic selection systems like the above genGA, ssGA and Eigen model in 
time-dependent fitness landscape can be easily understood. The notion of regular 
changes allows for an exact calculation of the asymptotic state in the sense of a gen- 
eralized, time-dependent quasispecies. For a genGA with NiH that moves regularly to 
nearest neighbors, the quasispecies can be straightforwardly calculated under simpli- 
fying assumptions. The result is a lower bound for the exact quasispecies. With that 
lower bound, we have constructed the phase diagram in the infinite population limit. 
This phase diagram is in agreement with the one raised from general arguments. 

In order to improve our analysis, we need to weaken our assumptions. In particu- 
lar, we have to overcome the restriction of taking into account only the fiow from the 
current towards the future needle string. The presence of other contributions to the 
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flow has to be modeled in some way. Another future step could be an investigation 
of the fluctuations that are introduced by the finiteness of realistic populations (dis- 
creteness of Km) around the quasispecies. This would lead to a lower boundary for the 
population size above which the needle string is not lost due to those fluctuations. 

An extension of our analysis to non-regularities like the occurrence of more than 
a single jump rule, can be achieved by averaging the time evolution Eq. |] for n ^ oo 
according to each rule's probability of being applied. A similar averaging procedure 
will be necessary if fluctuations of the cycle length r are present. Finally, an extension 
of the description to broader, more realistic peaks, as well as GA models including 
crossover and other selection schemes, are important topics for future work. 
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